Quick Start - ChartsMaze EDL Pipeline

Run Your First Pipeline

Get from zero to a complete dataset of 2,775 stocks with 86 fields each in under 10 minutes. The optional first-time OHLCV history download (Phase 2.5) adds about 30 minutes.

Navigate to Pipeline Directory

cd ~/"workspace/source/DO NOT DELETE EDL PIPELINE"

The directory name contains spaces, so the path must be quoted in shell commands. Keep the leading ~/ outside the quotes: the shell does not expand a tilde that appears inside quotes.
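
If you would rather launch the pipeline from a Python script than from a shell, the quoting question goes away. The sketch below is one possible way to do that, assuming the directory lives under your home folder at the path shown above. `PIPELINE_DIR` and `run_pipeline` are illustrative names, not part of the pipeline itself.

```python
import subprocess
import sys
from pathlib import Path

# Assumed location, taken from the cd command above; adjust if yours differs.
PIPELINE_DIR = Path.home() / "workspace" / "source" / "DO NOT DELETE EDL PIPELINE"


def run_pipeline(pipeline_dir=PIPELINE_DIR):
    """Run run_full_pipeline.py inside pipeline_dir and return its exit code."""
    # The command is a list and the directory is passed through cwd, so no
    # shell parses the path and the spaces need no quoting.
    result = subprocess.run(
        [sys.executable, "run_full_pipeline.py"],
        cwd=pipeline_dir,
    )
    return result.returncode
```

`sys.executable` reuses the interpreter that runs this script, which may or may not be the same `python3` your shell would pick.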

Run the Master Pipeline

python3 run_full_pipeline.py

This single command orchestrates 18 scripts across 6 phases:

Phase Breakdown

Phase 1: Core Data (Foundation)

Fetches 2,775 NSE stocks
Creates master ISIN map
Downloads fundamental data (35 MB)

Phase 2: Data Enrichment

Company filings (hybrid LODR + Legacy)
Live announcements
Advanced technical indicators
Market news (50 articles/stock)
Corporate actions
Surveillance lists (ASM/GSM)
Circuit stocks
Bulk/Block deals
Price band revisions

Phase 2.5: OHLCV History (Optional)

Smart incremental download
Lifetime daily candles
First run: ~30 min | Incremental: ~2-5 min
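
One way to picture the incremental mode is "fetch only what is newer than the last stored candle, then merge". The following is a minimal sketch of that idea, not the pipeline's actual code. The candle layout (dicts with an ISO `date` string) and both function names are assumptions made for illustration.

```python
def next_fetch_start(existing, default_start):
    """Return the date from which candles should be requested."""
    if not existing:
        # First run: nothing stored yet, so fetch the full history.
        return default_start
    # Re-request the last stored day in case it was saved mid-session.
    return max(c["date"] for c in existing)


def merge_candles(existing, fetched):
    """Merge fetched candles into existing ones, one candle per date."""
    by_date = {c["date"]: c for c in existing}
    for candle in fetched:
        by_date[candle["date"]] = candle  # fetched data wins on overlap
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return [by_date[d] for d in sorted(by_date)]
```

Keying by date keeps the merge idempotent: running the same incremental update twice leaves the stored history unchanged.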

Phase 3: Base Analysis

Builds master JSON with 60+ base fields

Phase 4: Enrichment (Sequential)

Advanced metrics (ADR, RVOL, ATH, Turnover)
Earnings performance (post-results returns)
F&O data (lot sizes, expiry dates)
Market breadth & relative strength
Corporate events + news feed (runs last)

Phase 5: Compression

GZIP level 9 compression
~30 MB → 2-4 MB (roughly 87-93% reduction)
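
To make the compression step and its percentage concrete, here is a small sketch using Python's standard `gzip` module at level 9. It is an illustration under assumptions rather than the pipeline's own code; `compress_records` and `reduction_pct` are hypothetical names.

```python
import gzip
import json


def compress_records(records):
    """Serialize records to JSON and gzip them at the maximum level (9)."""
    raw = json.dumps(records, ensure_ascii=False).encode("utf-8")
    packed = gzip.compress(raw, compresslevel=9)
    return raw, packed


def reduction_pct(raw_size, compressed_size):
    """Percentage by which the compressed size is smaller than the raw size."""
    return 100.0 * (1 - compressed_size / raw_size)
```

Applied to the sample run shown later in this guide, `reduction_pct(32.4, 3.2)` comes to about 90%, matching the figure in the log.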

Monitor Pipeline Progress

Watch real-time progress with phase labels and timing:

═══════════════════════════════════════════════════════════════
  EDL PIPELINE - FULL DATA REFRESH
═══════════════════════════════════════════════════════════════

📦 PHASE 1: Core Data (Foundation)
────────────────────────────────────────
  ▶ Running fetch_dhan_data.py...
  ✅ fetch_dhan_data.py (12.3s)
  ▶ Running fetch_fundamental_data.py...
  ✅ fetch_fundamental_data.py (45.2s)
  ▶ Downloading NSE Listing Dates...
  ✅ NSE Listing Dates downloaded.

📡 PHASE 2: Data Enrichment (Fetching)
────────────────────────────────────────
  ▶ Running fetch_company_filings.py...
  ✅ fetch_company_filings.py (89.7s)
  ...

✨ PHASE 4: Enrichment (Injecting into Master JSON)
────────────────────────────────────────
  ▶ Running advanced_metrics_processor.py...
  ✅ advanced_metrics_processor.py (15.4s)
  ...

📦 PHASE 5: Compression (.json → .json.gz)
────────────────────────────────────────
  📦 Compressed: 32.4 MB → 3.2 MB (90% reduction)

═══════════════════════════════════════════════════════════════
  PIPELINE COMPLETE
═══════════════════════════════════════════════════════════════
  Total Time:  285.7s (4.8 min)
  Successful:  18/18
  Failed:      0/18

  📄 Output: all_stocks_fundamental_analysis.json.gz (3.2 MB)
  🧹 Only .json.gz + ohlcv_data/ remain. All intermediate data purged.
═══════════════════════════════════════════════════════════════
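
If you capture this output to a file, a few lines of Python can pull out the per-script timings, for example to spot the slowest step. This sketch assumes the success lines keep the `✅ name.py (12.3s)` shape shown above. `script_timings` and `slowest_step` are hypothetical helpers.

```python
import re

# Matches success lines such as "  ✅ fetch_dhan_data.py (12.3s)".
STEP_RE = re.compile(r"✅\s+(?P<script>\S+\.py)\s+\((?P<secs>\d+(?:\.\d+)?)s\)")


def script_timings(log_text):
    """Map each completed script name to its reported duration in seconds."""
    return {m["script"]: float(m["secs"]) for m in STEP_RE.finditer(log_text)}


def slowest_step(timings):
    """Return (script, seconds) for the longest-running script, or None."""
    if not timings:
        return None
    name = max(timings, key=timings.get)
    return name, timings[name]
```

Lines without a duration, such as the NSE Listing Dates download, are simply skipped.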

Verify Output

Check that the compressed output file was created:

ls -lh all_stocks_fundamental_analysis.json.gz

Expected output (owner, size, and timestamp will differ on your machine):

-rw-r--r-- 1 user user 3.2M Mar 03 14:35 all_stocks_fundamental_analysis.json.gz

Extract and Inspect

Decompress and view a sample record:

gunzip -c all_stocks_fundamental_analysis.json.gz | python3 -m json.tool | head -n 100

Or use the built-in single stock analyzer:

python3 single_stock_analyzer.py
# Enter symbol when prompted: RELIANCE
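
For a scripted check instead of eyeballing the output, something like the following could load the file and report a few basic numbers. It is a sketch that assumes the file is a gzipped JSON array of objects with a `Symbol` key, as described in this guide. `load_stocks` and `summarize` are illustrative names.

```python
import gzip
import json


def load_stocks(path="all_stocks_fundamental_analysis.json.gz"):
    """Read the compressed pipeline output into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


def summarize(stocks):
    """Return record count, field-count range, and records lacking a Symbol."""
    sizes = [len(s) for s in stocks]
    return {
        "stocks": len(stocks),
        "min_fields": min(sizes, default=0),
        "max_fields": max(sizes, default=0),
        "missing_symbol": sum(1 for s in stocks if not s.get("Symbol")),
    }
```

On a full run, `summarize(load_stocks())` would be expected to report about 2,775 stocks; a much lower count or a nonzero `missing_symbol` suggests that something went wrong upstream.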

Configuration Options

Customize pipeline behavior by editing run_full_pipeline.py (lines 60-71):

# OHLCV: Auto-detect mode
# True = always fetch (incremental update: ~2-5 min if data exists, ~30 min first time)
# False = skip entirely (ADR, RVOL, ATH, % from ATH fields will be 0)
FETCH_OHLCV = True

# Set to True to also fetch standalone data (Indices, ETFs)
FETCH_OPTIONAL = False

# Auto-delete intermediate files after pipeline succeeds
# Keeps: all_stocks_fundamental_analysis.json.gz + ohlcv_data/
CLEANUP_INTERMEDIATE = True

OHLCV Impact: Setting FETCH_OHLCV = False will result in zero values for these fields:

ADR (Average Daily Range)
RVOL (Relative Volume)
ATH (All-Time High)
% from ATH
200 Days EMA Volume
Returns since Earnings
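
Because these fields come out as 0 rather than being omitted, a dataset built without OHLCV can be hard to tell apart from a normal one at a glance. The sketch below is one way to detect it, using only field names that appear in the sample record in this guide. `OHLCV_FIELDS` and `looks_like_ohlcv_skipped` are hypothetical names.

```python
# OHLCV-dependent keys as they appear in this guide's sample record.
OHLCV_FIELDS = (
    "RVOL",
    "% from ATH",
    "5 Days MA ADR(%)",
    "14 Days MA ADR(%)",
    "Returns since Earnings(%)",
)


def looks_like_ohlcv_skipped(stocks, fields=OHLCV_FIELDS):
    """True if every listed field is zero or missing for every stock."""
    if not stocks:
        return False
    # A single stock can legitimately have a 0 (for example, trading at its
    # all-time high), so only conclude "skipped" when all stocks agree.
    return all(not s.get(f) for s in stocks for f in fields)
```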

Understanding the Output

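The pipeline produces a JSON array with 2,775 stock objects. The // comments in the sample below are annotations for this guide only; the actual file is plain JSON without comments: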

[
  {
    "Symbol": "RELIANCE",
    "Name": "Reliance Industries Ltd.",
    "Listing Date": "29-Nov-1977",
    "Basic Industry": "Refineries",
    "Sector": "Energy",
    "Index": "NIFTY 50, NIFTY 500",
    
    // Fundamentals (35 fields)
    "Market Cap(Cr.)": 1725430.5,
    "Stock Price(₹)": 2567.85,
    "Latest Quarter": "Dec 2025",
    "Net Profit Latest Quarter(Cr.)": 18670.0,
    "EPS Latest Quarter": 27.65,
    "Sales Latest Quarter(Cr.)": 254890.0,
    "OPM Latest Quarter(%)": 12.4,
    // ... 25+ more fundamental fields
    
    // Valuation Ratios (10 fields)
    "P/E": 28.45,
    "Forward P/E": 24.12,
    "ROE(%)": 12.34,
    "ROCE(%)": 15.67,
    "D/E": 0.45,
    // ... 5+ more ratio fields
    
    // Technical Indicators (7 fields)
    "RSI (14)": 62.5,
    "SMA Status": "SMA 20: Above (4.9%) | SMA 50: Above (24.1%)",
    "EMA Status": "EMA 20: Above (6.3%) | EMA 200: Above (72.6%)",
    "Technical Sentiment": "RSI: Neutral | MACD: Bearish",
    "Pivot Point": 2545.50,
    
    // Price Performance (9 fields)
    "1 Day Returns(%)": 1.2,
    "1 Week Returns(%)": 3.4,
    "1 Month Returns(%)": 5.6,
    "1 Year Returns(%)": 24.5,
    "% from 52W High": -8.5,
    "% from ATH": -12.3,
    
    // Volume & Liquidity (6 fields)
    "RVOL": 1.45,
    "Daily Rupee Turnover 20(Cr.)": 850.5,
    "30 Days Average Rupee Volume(Cr.)": 890.2,
    
    // Volatility (4 fields)
    "5 Days MA ADR(%)": 2.3,
    "14 Days MA ADR(%)": 2.1,
    
    // Earnings Tracking (3 fields)
    "Quarterly Results Date": "14-Jan-2026",
    "Returns since Earnings(%)": 8.5,
    "Max Returns since Earnings(%)": 12.3,
    
    // Event Markers (multi-value string)
    "Event Markers": "📊: Results Recently Out | 💸: Dividend (15-Mar)",
    
    // Recent Announcements (array of objects)
    "Recent Announcements": [
      {
        "Date": "15-Jan-2026",
        "Headline": "Board Meeting - Consideration of Quarterly Results",
        "URL": "https://..."
      },
      // ... up to 5 items
    ],
    
    // News Feed (array of objects)
    "News Feed": [
      {
        "Title": "Reliance Industries Q3 profit beats estimates",
        "Sentiment": "positive",
        "Date": "15-Jan-2026 09:45"
      },
      // ... up to 5 items
    ]
  },
  // ... 2,774 more stocks
]
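
Fields such as "SMA Status" and "EMA Status" pack several readings into one string, which is awkward to filter on numerically. The following sketch splits such a string into parts, assuming the `Name: State (x%)` layout seen in the sample record. `parse_ma_status` is a hypothetical helper, and the exact wording used for other states is not confirmed here.

```python
import re

# One part looks like "SMA 20: Above (4.9%)".
PART_RE = re.compile(
    r"^(?P<name>.+?):\s*(?P<state>\w+)\s*\((?P<pct>-?\d+(?:\.\d+)?)%\)$"
)


def parse_ma_status(status):
    """Turn 'SMA 20: Above (4.9%) | SMA 50: ...' into {name: (state, pct)}."""
    result = {}
    for part in (status or "").split("|"):
        match = PART_RE.match(part.strip())
        if match:  # silently skip parts that do not fit the assumed layout
            result[match["name"]] = (match["state"], float(match["pct"]))
    return result
```

With the sample value above, `parse_ma_status(stock["SMA Status"])["SMA 50"]` yields `("Above", 24.1)`.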

Common Use Cases

Screen Stocks by Fundamentals

import json
import gzip

with gzip.open('all_stocks_fundamental_analysis.json.gz', 'rt') as f:
    stocks = json.load(f)

# Find stocks with:
# - ROE > 15%
# - P/E < 25
# - Market Cap > 1000 Cr
# - 1 Year Returns > 20%

screened = [
    s for s in stocks
    if s.get('ROE(%)', 0) > 15
    and s.get('P/E', 999) < 25
    and s.get('Market Cap(Cr.)', 0) > 1000
    and s.get('1 Year Returns(%)', -999) > 20
]

print(f"Found {len(screened)} stocks matching criteria:")
for stock in screened[:10]:
    print(f"{stock['Symbol']:12} | ROE: {stock['ROE(%)']:5.1f}% | P/E: {stock['P/E']:5.1f}")
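
The `.get(field, default)` pattern above covers missing keys, but not keys that are present with a null or text value, which would raise a `TypeError` on comparison. Whether the pipeline ever emits such values is not stated here, so treat the following as a defensive sketch. `num` and `screen` are illustrative names.

```python
def num(stock, field, default):
    """Return stock[field] if it is a real number, otherwise default."""
    value = stock.get(field)
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return default
    return value


def screen(stocks):
    """Same criteria as above, but tolerant of null or non-numeric values."""
    return [
        s for s in stocks
        if num(s, "ROE(%)", 0) > 15
        and num(s, "P/E", 999) < 25
        and num(s, "Market Cap(Cr.)", 0) > 1000
        and num(s, "1 Year Returns(%)", -999) > 20
    ]
```

The defaults are chosen so that a stock with an unusable value fails the corresponding test instead of passing it by accident.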

Track Corporate Events

# Assumes `stocks` is already loaded as in the first example
# Find all stocks with upcoming dividends
dividend_stocks = [
    s for s in stocks
    if s.get('Event Markers') and 'Dividend' in s['Event Markers']
]

print(f"{len(dividend_stocks)} stocks with upcoming dividends:")
for stock in dividend_stocks:
    print(f"{stock['Symbol']:12} | {stock['Event Markers']}")
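
A plain substring test works, but splitting the "Event Markers" string into individual events makes it easier to list or count them. This sketch assumes the `icon: label | icon: label` layout from the sample record. `parse_event_markers` and `has_event` are hypothetical helpers.

```python
def parse_event_markers(markers):
    """Split an Event Markers string into a list of (icon, label) pairs."""
    events = []
    for part in (markers or "").split("|"):
        part = part.strip()
        if not part:
            continue
        icon, sep, label = part.partition(":")
        if sep:
            events.append((icon.strip(), label.strip()))
        else:
            # No colon found: keep the whole part as the label.
            events.append(("", part))
    return events


def has_event(stock, keyword):
    """True if any event label contains keyword (case-insensitive)."""
    pairs = parse_event_markers(stock.get("Event Markers"))
    return any(keyword.lower() in label.lower() for _, label in pairs)
```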

Analyze Post-Earnings Performance

# Assumes `stocks` is already loaded as in the first example
# Find stocks with strong post-earnings momentum
strong_earnings = [
    s for s in stocks
    if s.get('Returns since Earnings(%)', 0) > 10
    and s.get('Max Returns since Earnings(%)', 0) > 15
]

# Sort by returns since earnings
strong_earnings.sort(key=lambda x: x['Returns since Earnings(%)'], reverse=True)

print(f"{len(strong_earnings)} stocks with >10% returns since earnings and >15% peak returns:")
for stock in strong_earnings[:20]:
    print(f"{stock['Symbol']:12} | Since: {stock['Returns since Earnings(%)']:6.1f}% | Max: {stock['Max Returns since Earnings(%)']:6.1f}%")
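
A return figure means more when you know how long ago the results came out. The sketch below narrows the list to recent results by parsing "Quarterly Results Date", assuming the `DD-Mon-YYYY` format from the sample record and numeric return values. Note that `%b` depends on the locale, so English month abbreviations are assumed. Both function names are illustrative.

```python
from datetime import date, datetime


def days_since_results(stock, today):
    """Days between the stock's results date and today, or None if unknown."""
    raw = stock.get("Quarterly Results Date")
    if not raw:
        return None
    try:
        results_day = datetime.strptime(raw, "%d-%b-%Y").date()
    except ValueError:
        return None  # date not in the assumed DD-Mon-YYYY format
    return (today - results_day).days


def recent_strong_earnings(stocks, today, max_days=30, min_return=10.0):
    """Stocks whose results are at most max_days old with returns above min_return."""
    picked = []
    for s in stocks:
        age = days_since_results(s, today)
        ret = s.get("Returns since Earnings(%)") or 0
        if age is not None and 0 <= age <= max_days and ret > min_return:
            picked.append(s)
    return picked
```

In practice you would pass `date.today()` as `today`; it is a parameter here so the result is reproducible.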

Monitor Technical Breakouts

# Assumes `stocks` is already loaded as in the first example
# Find stocks with:
# - RSI between 50 and 70 (momentum, but not overbought)
# - Price above SMA 50
# - RVOL > 1.5 (high relative volume)

breakout_candidates = []
for s in stocks:
    rsi = s.get('RSI (14)', 0)
    sma_status = s.get('SMA Status', '')
    rvol = s.get('RVOL', 0)
    
    if (50 < rsi < 70 
        and 'SMA 50: Above' in sma_status 
        and rvol > 1.5):
        breakout_candidates.append(s)

print(f"{len(breakout_candidates)} breakout candidates:")
for stock in breakout_candidates:
    print(f"{stock['Symbol']:12} | RSI: {stock['RSI (14)']:5.1f} | RVOL: {stock['RVOL']:4.2f}")

Next Steps

Pipeline Settings

Learn advanced configuration options

Field Reference

Explore all 86 output fields in detail

Pipeline Architecture

Understand the pipeline design

Data Fetching Scripts

See API reference for all scripts